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We study the Boson Peak phenomenology experimentally observed in globular proteins by means of elastic network models. These models 
are suitable for an analytic treatment in the framework of Euclidean Random Matrix theory, whose predictions can be numerically tested 
on real proteins structures. We find that the emergence of the Boson Peak is strictly related to an intrinsic mechanical instability of the 
protein, in close similarity to what is thought to happen in glasses. The biological implications of this conclusion are also discussed by 
focusing on a representative case study. 

1 Introduction 

The analogy between proteins and structural glasses has a long story and solid foundations. Among the 
different physical properties shared by the two classes of systems one can mention: (i) the anomalous 
specific heat [1]; (ii) the slow energy relaxation processes [2-4]; (iii) the existence of a dynamical transition 
as witnessed by the temperature dependence of the the atomic mean squared displacements [5,6]; (iv) 
the excess of low frequency modes (the Boson Peak) in dynamical spectra obtained by inelastic neutron 
scattering experiments [7-9]. Although a somewhat generic explanation of the above experimental facts 
can be given in terms of a rugged energy landscape [10-12], a comprehensive explanation supported by 
solvable models is still missing. Incidentally, it is no surprise that proteins have been proposed as the 
paradigm of complexity in biological systems [13]. 

Here we focus on one of the glass-like features of proteins, the Boson Peak (BP). The latter is formally 
defined as the low frequency peak in the function g(uj)/gD(uj), where g(ui) is the vibrational density 
of states (DOS) as measured experimentally, and go(^) cj 2 is the Debye behavior in the cj — ► 
limit. A peak in this function would then represent an excess of low vibrational modes with respect to 
a perfect harmonic crystal. The very origin of these new modes, related to the amorphous nature of the 
low energy configurations, has been the subject of a long debate in the recent years and yet there is 
no general agreement. As for structural glasses, people have proposed "soft-potential" theories including 
strong anharmonicities [14,15], harmonic lattices with random springs [16,17], extensions of standard mode 
coupling theory to the "frozen" state [18,19], and topologically disordered models [20,21]. Recently, the 
possibility that the BP is a universal feature of weakly connected systems has been suggested [22]. 

In this paper we will take the point of view that proteins can be considered to some extent as "random" 
structures, akin to those of structural glasses. In a sense, this constitutes our working hypothesis to be 
tested a posteriori. However, the rationale behind it can be identified in the empirical observation of 
the large scale structural properties of proteins. For instance, the pair correlation function of proteins at 
distances larger than « 10 -i- 15 A can be hardly distinguished from that of a set of points randomly 
distributed in an equivalent volume [23]. As a consequence, the "random" approximation is expected to 
work reasonably well for studying the behaviour of low frequency modes. 
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Figure 1. Left: Schematic cartoon of an Elastic Network Model. The protein is seen as a polymer-like chain coarse-grained at the 
amino-acid level with elastic pairwise interactions. Right: Pictorial representation of the phase diagram predicted by the ERM theory. 

The rest of the paper is organized as follows. In section El we introduce the elastic network model of 
protein fluctuations and we briefly justify its usage in this context. In section [HI we discuss the harmonic 
approximation and the theoretical framework of the Euclidean Random Matrix (ERM) theory, whose 
main findings are also briefly summarized as a reference for the interpretation of our numerical results. 
In section 0] we report our results on the BP and on the role of the critical parameter, as well as a study 
of the relationship between functional modes and proteins' stability. Our comments and suggestions for 
further experimental studies are discussed in the last section. 



2 Model building: Elastic Networks 

The introduction of coarse-grained elastic network models in the context of protein dynamics is quite 
recent [24]. In fact, as a direct consequence of the known complexity of proteins' energy landscapes, all 
microscopic details are usually deemed essential to characterize their functional dynamics. As a result, 
people have long failed to realize that most features of the large- and medium-scale dynamics of proteins 
close to their native state, i.e. those related to biological function, can be successfully reproduced by simple 
harmonic interactions between amino-acids [25-28]. 

In our case, we aim at describing a universal feature of globular proteins at low frequency. Hence, the 
class of elastic network models appears a fortiori the correct choice to capture the main aspects of the 
BP phenomenology. In this spirit, the fine structural detail of microscopic interactions will be completely 
neglected, as well as the information contained in the amino-acid sequence. We thus coarse-grain the protein 
structure at the amino-acid level and replace each residue by a single particle whose equilibrium position 
coincides with that of the a-carbon as measured e.g. through X-ray crystallography. In practice, this 
amounts to dealing with a polymer folded exactly like the real protein. Each pair of particles falling within 
a fixed cutoff inter-distance are assumed to interact harmonically. More precisely, the global Hamiltonian 
of the system is (see the left panel of Fig. [I] for the notations): 

W[{r«};{rS°)}]=X;^(Ag ) )(|A u |-|AW|) 2 . (1) 

The interaction stiffness can be also allowed to decrease smoothly with the pair distance as e.g. K(A^) = 
ftexp [ — (A^/r c ) 2 ] (Gaussian model), but a sharp cutoff (i.e. stepwise) form of the type K(A^) = 
K@[r c — A^] could also be used. In this paper, we will adopt the Gaussian model. 
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The parameter k sets the physical units for force constants, and is usually fixed by requiring the theo- 
retical mean squared displacements of residues to match the experimental ones as determined from X-ray 
spectra for a given choice of the cutoff r c [25,26]. In principle, the latter should be tuned by fitting the 
low-frequency portion of experimental spectra at temperatures below the dynamical transition, where the 
protein fluctuates harmonically within the native minimum. However, such studies have never appeared in 
the literature to the best of our knowledge. The usual, cheaper alternative is to compare the theory with 
numerical spectra obtained by modeling the protein dynamics through all-atom force fields [27]. By doing 
so, one obtains p c « 3 A in an all-atom representation, which reduces to r c « {N a ) l ^p c « 8 A when the 
protein structure is coarse-grained at the residue level. In the following, such value wil be referred to as 
the optimal cutoff length scale. 



3 Harmonic approximation and Euclidean Random Matrix Theory 

Taking the harmonic approximation of the Hamiltonian (JJJ one is led to the quadratic form TL ~ |((/?, M(p), 

where the Hessian matrix Ai depends on the a-carbon coordinates {r^}. As we anticipated in section 
[IJ we will investigate the consequences of the hypothesis that these positions are random in space. Under 
this assumption, the matrix AA falls into the broad class of Euclidean Random Matrices [29]. In general 
terms, our problem amounts to taking N points at random in a volume V, assume that they interact 
according to some pair potential v(r) and then compute the vibrational DOS of this topologically disordered 
system in the thermodynamic limit TV, V — ► oc, while the density N/V = p stays finite. After some 
efforts [20,21,30,31], the following regimes have been identified under rather generic assumptions (see also 
the right panel of Fig. QJ: 

• P ^ Pa high-density / low-temperature: the DOS is found to follow exactly the Debye behavior, g{uS) oc 
cj 2 . An amorphous solid behaves like a perfect crystal in the limit of very large density, meaning that 
an infinitely compact medium does not feel at all the presence of the disorder. 

• P ^ Pc- a peak in g(uo)/uo 2 emerges at upp, where ujpp ~ (p — p c ), and its height is found to diverge 
like hpp ~ (p — Pc)~ 1 ^ 2 - This peak is identified with the BP and is due to the fact that the system 
"feels" the presence of a mechanical instability at p c . The actual value of p c depends on the details of 
the interaction. 

• p < p c : The characterization of the low density phase depends on the spatial nature of the interactions. 
In a vectorial model [21] the Hessian matrix has negative eigenvalues (i.e. imaginary frequencies). This 
phenomenon is usually referred to as a topological phase transition. The DOS at zero energy plays the 
role of an order parameter and one finds that 5(0) ~ (p c - p) 1/2 . 

To summarize, the system undergoes a phase transition, from a "solid" phase characterized by the 
minima of the energy surface, to a "liquid" phase where saddles become relevant for the high-frequency 
dynamics. If the Hessian matrix is positively defined and thus no negative modes are possible, the DOS in 
the liquid phase just reduces to a delta function at zero frequency [20]. It is also interesting to mention that 
the mean field exponents are consistent with the numerical findings on realistic glass-forming materials [31]. 

In view of applying this approach to our protein model, a control parameter has to be identified. The 
most natural choice is the interaction cutoff r c . Given that the average connectivity (c) ~ Tr.M ~ r\, r c 
is indeed a global measure of compactness of the protein. Very much like the temperature, this parameter 
is expected to signal the approach of an instability akin to a liquid-glass transition: if r c is very large, the 
protein is extremely rigid and one expects its DOS to follow the Debye law. As r c decreases, the protein 
looses stability and becomes an extremely flexible object, until it unfolds and eventually melts. If this idea 
is correct, one should detect a trace of this progressive structural change by looking at the vibrational 
DOS upon moving r c . This is the problem we address in the next section. 
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Figure 2. The BP phenomenology for the PDZ binding domain. Left panels: DOS and DOS divided by the Debye law. Solid line 
r c = 5.5 A, dotted line r c = 5.7 A, dashed line r c = 6 A, dot-dashed line r c = 6.5 A, dot-dot-dashed line r c = 7 A. Right panels: scaling 
of the BP frequency and height with the cutoff r c . Solid lines represent the mean field prediction, ujbp ~ {r c — r*), 

h B p ~ (rc-r*)- 1 / 2 . 

4 Results 

4.1 T/ie Boson Peak 

In Fig. [2] we report our results for the vibrational DOS of one particular protein out of the examined 
ensemble, namely the PDZ binding domain. In order to obtain a smooth and more reliable DOS, we 
generate a large number (order 10 3 ) of surrogates whose topology is compatible with the experimentally 
determined native structure [23]. The low frequency region shows a strong dependence on r c , since a non- 
trivial excess of modes develops as a peak which eventually diverges for some critical r*. We identify this 
peak as the BP. The position and height of the BP are plotted in the right panels versus the interaction 
cutoff r c . The mean field predictions are also shown to perform reasonably well in the vicinity of the phase 
transition. 

We have repeated this analysis on several proteins, choosing structures in a wide range of sizes (from 51 
to 578 residues), and we always found an identical phenomenology. In order to clarify the physical origin 
of this behavior and its significance in biological terms, we investigate in the next subsection the role of 
the critical interaction cutoff r* by measuring its correlation with structural properties of the protein. 

4.2 The critical cutoff r*; correlation with structural properties 

In Table E] we report measures of selected geometrical and structural indicators for a choice of proteins of 
different sizes, along with the correlations with the corresponding value of the critical cutoff distance r*. 

• We find a positive correlation between r* and the number of residues N. 

• We also find a positive correlation with a measure of the protein volume V estimated by constructing 
the equivalent ellipsoid according to the procedure described in ref. [32]. Such ellipsoid is characterized 
by three principal radii ai, and as, which are obtained by diagonalising the symmetric matrix A^ v — 

r i r i where /i, v — 1,2,3 are Cartesian labels and T{ are the amino-acid coordinates. In the case 
of a continuous distribution of mass inside the ellipsoid, it can be shown that the three radii are simply 
given by = \/5A^ where Ai, A2, A3 are the eigenvalues of the matrix A. 

• There clearly exists a negative correlation between r* and the packing coefficient. The latter is a non- 
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dimensional measure of compactness defined as 

4tt fN\ f d 



P 3 \V J V 2 

where do — 3.83 A is the distance between two consecutive residues along the protein main chain. 
The critical values r* turn out to be positively correlated with the fraction of flexible residues. Such 
measure is calculated by flagging as highly fluctuating a residue whose average fluctuation over the first 
three slow modes is above its overall average fluctuation [33]. Average fluctuations over the first n modes 
are computed as 

2x _3^Tl^,_ 2i 



k n 

k=i 

where ^ is the k— th eigenvector of the Hessian matrix and uj\ the corresponding eigenvalue. 
• Finally, a poor correlation is found between r* and a combined measure of secondary structure content, 
defined as the fraction of residues classified either as from a-helices of /3-sheets [34] . 

Overall, the measured correlations confirm that r* is an indirect measure of structural stability of a 
protein, small values of r* corresponding to highly compact (higher packing coefficient) and less flexible 
structures. Moreover, r* is shown to increase with extensive quantities such as N and V , which may reflect 
the inherent increasing loss of globularity of larger structures. Surprisingly enough, the degree of secondary 
structure content does not prove to be an accurate indicator of stability in this sense. However, this only 
indicates that r* is mostly sensitive to the large-scale degree of compactness, relevant to the robustness of 
the lowest vibrational modes. In other words, the local high stability of secondary motifs is not equivalent 
to the large-scale one as probed by r*. This is also evident from the poor correlation (-0.19) existing 
between the secondary structure content and the packing coefficient p. 

To be more clear, what we have found is that the excess of modes appearing in the low frequency region 
of a protein's DOS is a precursory feature that flags the approaching of a topological instability, whose 
meaning is to be traced back to the analogy with glass-forming materials. In particular, as it is the case 
for the Gaussian model in glasses [20,21], such excess of modes should be interpreted as a precursor of 
the transition within a model that by definition becomes meaningless at the critical point. In fact, as the 
interaction cutoff r c is decreased below the typical range of the first off-chain coordination shell, the model 
describes protein conformations that start loosening until they eventually unfold, thus becoming closer 
and closer to liquid-like assemblies of amino-acids. From a biological point of view, all that hints at an 
inherent effort of reconciling between spatial properties of liquids, i.e. increased degree of mobility, and 
the necessity of maintaining the structural stability of compact biological agents. 



4.3 Critical cutoff and functional motions: the case of PDZ binding domains 

The connection of the structural instability of a protein as revealed by the Boson peak analysis and the 
details of its functional dynamics is a matter of investigation deserving a systematic study on its own 
right. Here, we give a an example of such analysis by sticking to the specific example of the PDZ binding 
domain. 

Postsynaptic density-95/disks large/zonula occludens-1 (PDZ) interaction domains are crucial in reg- 
ulating the cell dynamics. They play a fundamental role in signaling pathways by organizing networks 
of receptors and in targeting selected cellular proteins to multi-protein complexes [37-40]. Most PDZ- 
mediated interactions occur through the recognition of C-terminal peptide motifs by a large binding cleft 
built in the PDZ fold (see Fig. 03). Interestingly, experimental evidence from NMR measurements strongly 
suggests that the dynamics of PDZ domains upon ligand binding show correlations over the entire protein 
structure: this confirms the crucial role of collective low-frequency modes in the biological functions of 
proteins [41]. 
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Table 1. 
tors. 



Values of the critical cutoff distance r* for different proteins and correlation with geometrical and structural indica- 



Protein 


PDB-id 


N 


V (10 4 A 3 ) 


V 


% flex residues 


(a + /?)-fract. 


rt (A) 


Insulin 


4INS 


51 


0.75 


0.20 


27.45 


0.53 


4.57 


Protein G 


1PGB 


56 


0.76 


0.21 


37.50 


0.70 


3.64 


Ubiquitin( a ) 


1UBI 


71 


1.04 


0.20 


35.21 


0.46 


3.53 


PDZ binding domain( a ) 


1BFE 


85 


1.20 


0.21 


30.00 


0.55 


4.03 


Lysozyme 


166L 


162 


2.75 


0.17 


43.21 


0.74 


4.27 


Adenylate Kinase 


4AKE 


214 


5.41 


0.12 


50.93 


0.64 


7.85 


LAO 


2LA0 


238 


4.36 


0.16 


47.06 


0.60 


5.44 


CYSB 


1AL3 


260 


4.40 


0.17 


41.15 


0.59 


4.70 


PBGD 


1PDA 


296 


5.33 


0.16 


41.22 


0.60 


3.70 


Thermolysin 


5TLN 


316 


5.02 


0.18 


40.51 


0.53 


4.55 


HSP70 ATP-binding domain 


3HSC 


382 


7.49 


0.15 


38.74 


0.66 


5.28 


Fab- fragment 


1AE6 


437 


9.57 


0.13 


45.08 


0.48 


5.70 


Serum Albumin 


1A06 


578 


14.55 


0.12 


47.06 


0.70 


5.70 


Correlation with r* 




0.45 


0.52 


-0.82 


0.67 


0.17 


1 



a In these proteins short terminal tails of a few residues have been cut out in order to recover the correct low-frequency 
modes of the structure. 
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Figure 3. Pattern correlation plots of the second slowest normal mode as calculated through the Gaussian ANM model for the PDZ 
binding domain. The regions characterized by the largest displacements are highlighted and referenced explicitly within the protein 
structure. The functional motion corresponds to LI and aB moving out of phase (cleft opening/closing). 



In a previous work [36], a thorough normal-mode analysis of the PDZ binding dynamics has been 
performed by means of the CHARMM all-atom force field and of a recent version of the ANM, the C/3- 
ANM [42]. In such a model, also C/3 carbons are considered and the complex chemical interactions between 
residues are described by springs connecting all Ca-Ca, Ca-C/3, and C/3-C/3 pairs whose distances in the 
native fold are smaller than r c = 7.5 A. The main result was that a single low- frequency mode captures 
the main spatial correlations characterizing the binding dynamics, namely a concerted opening of the 
hydrophobic pocket (see cartoon in FigBJ). 

In order to investigate the correlation of such functional pattern with the general phenomenon of struc- 
tural instability occurring as the density is lowered, we have looked for a mode with the same displacement 
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template within the Gaussian ANM model as a function of the parameter r c . We observe that the known 
pattern is recovered in the second slowest mode, but only at relatively low values of r c , whereas the mode 
looses cooperativity at larger values of the interaction cutoff. This is shown in Fig. 03 where we report 
correlation plots for r c = 2, 3.8 and 8 A. The correlation matrix C for mode k is defined as 

s~ik \ A tk tk 

a=x,y,z 

where is the j—th component in directon a of the k— th normalized eigenvector, so that —1 < < 1. 

Our results indicate that, as r c is lowered towards the structural instability, the mode pattern becomes 
more and more pronounced. In particular, the loop LI and the helix become rigid, anti-correlated 
units regulating the large-scale opening of the binding cleft. At the same time, the displacement of loop 
L3 correlated to LI for tighter structures (r c = 8 A) gets strongly reduced. This result is a clear signature 
of the high robustness of large-scale functional motions. However, in order to assess the generality of our 
inferences, the same analysis should be repeated on a large pool of protein structures with known functional 
dynamics. 



5 Conclusions and perspectives 

In this paper we have addressed the Boson peak phenomenology experimentally observed in proteins by 
means of a simple elastic network model. After identifying the interaction cutoff as a control parameter, 
we have checked numerically the analytical predictions of ERM theory, already successfully reproduced by 
glass-forming materials. The BP turns out to be an intrinsic feature of proteins spectra as it is for structural 
glasses. On the same line, it signals the presence of a mechanical instability of the protein structure which 
can be related to the unfolding transition. However, in order to establish a precise relationship between 
BP divergence and unfolding transition, a more systematic study based on a sharp-cutoff elastic network 
model should be carried out. 

To substantiate our interpretation of the BP emergence as related to the presence of an underlying 
mechanical instability, we have measured the correlation between the critical cutoff r* and structural 
properties of the protein. The overall result indicates that r* may serve as an indirect measure of global 
compactness of the protein, unrelated to the presence of local more stable motifs, such as secondary 
structures. Another fact to be stressed is that the critical r c is always smaller than 7-1-8 A, which is 
supposed to be the "correct" value for r c obtained by comparing elastic network models with molecular 
dynamics spectra. Moreover, r* is always greater than, or at worst of the order of, the distance between 
consecutive amino-acids along the backbone, ensuring that the loss of global stability does not interfere 
with the polymeric nature of proteins. As a rule of thumb, the larger r*, the more unstable is the protein. 

Taken together with the experimental fact that proteins are typically active in an environment whose 
local temperature is a few degrees smaller than the unfolding one, all these observations lead to a new 
interpretation of the surprising fact that proteins live more or less close to a phase transition. According to 
our analysis, in order to be efficient molecular machines able to perform their biological function, a protein 
has to keep a relative mechanical rigidity while being able to easily access the local minima directly 
connected to its native state. The BP is then the universal signature of such a trade-off. Interestingly, the 
observation that functional modes are those which "survive" in the critical regime strongly supports this 
interpretation. 

We think that, in order to establish a deeper connection between the presence of the BP in proteins 
and this kind of theoretical approach, more experimental work is needed. The supposed relationship be- 
tween the structural stability of a protein, its unfolding transition temperature, and its vibrational features 
certainly deserves a more detailed study. Furthermore, a systematic experimental approach to the BP phe- 
nomenology in proteins would allow to establish strength and limits of the ERM theory with respect to 
real systems. For the sake of concreteness, we would like to propose some ideas for experimental investi- 
gation of the BP phenomenon in proteins that may prove useful in probing the validity of our theoretical 
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framework. 

In general, high pressure is a useful tool for the study of protein structure and dynamics [43]. More 
precisely, the effect of pressure on proteins is two-fold. Moderate pressures (j 0.4 GPa) cause elastic 
alterations in the spatial structure of the proteins, while further increase in pressure can cause the loss of 
the secondary structure along with the biological inactivation and denaturation. A number of pressure- 
dependent spectroscopic measurements have revealed non-trivial response of protein structures to pressure- 
induced stress, such as secondary structure-specific volume fluctuations and structure stabilization [44]. 
Moreover, normal mode calculations have also been successfully employed in such a context, showing that 
contributions to volume fluctuations from low frequency normal modes are typically found to dominate 
over those from higher frequency modes [45]. 

In view of the above facts, we propose to perform pressure-dependent measurements in the low-to mod- 
erate pressure regime to investigate the corresponding effect of density fluctuations on the BP frequency. 
In particular, combining vibrational spectroscopy with global experimental probes, such as adiabatic com- 
pressibility measurements and small angle X-ray scattering, should enable one to investigate experimentally 
the correlations between the critical interaction cutoff r* and non-local structural indicators such as the 
packing coefficient. 
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